Goto

Collaborating Authors

 Baltimore


Adversarially Robust Multi-task Representation Learning

Neural Information Processing Systems

We study adversarially robust transfer learning, wherein, given labeled data on multiple (source) tasks, the goal is to train a model with small robust error on a previously unseen (target) task. In particular, we consider a multi-task representation learning (MTRL) setting, i.e., we assume that the source and target tasks admit a simple (linear) predictor on top of a shared representation (e.g., the final hidden layer of a deep neural network). In this general setting, we provide rates on the excess adversarial (transfer) risk for Lipschitz losses and smooth nonnegative losses. These rates show that learning a representation using adversarial training on diverse tasks helps protect against inference-time attacks in data-scarce environments. Additionally, we provide novel rates for the single-task setting.


Dissecting the Interplay of Attention Paths in a Statistical Mechanics Theory of Transformers Lorenzo Tiberi Francesca Mignacco 3,4

Neural Information Processing Systems

Despite the remarkable empirical performance of transformers, their theoretical understanding remains elusive. Here, we consider a deep multi-head self-attention network, that is closely related to transformers yet analytically tractable. We develop a statistical mechanics theory of Bayesian learning in this model, deriving exact equations for the network's predictor statistics under the finite-width thermodynamic limit, i.e., N, P, P/N = O(1), where N is the network width and P is the number of training examples. Our theory shows that the predictor statistics are expressed as a sum of independent kernels, each one pairing different attention paths, defined as information pathways through different attention heads across layers. The kernels are weighted according to a task-relevant kernel combination mechanism that aligns the total kernel with the task labels. As a consequence, this interplay between attention paths enhances generalization performance. Experiments confirm our findings on both synthetic and real-world sequence classification tasks. Finally, our theory explicitly relates the kernel combination mechanism to properties of the learned weights, allowing for a qualitative transfer of its insights to models trained via gradient descent. As an illustration, we demonstrate an efficient size reduction of the network, by pruning those attention heads that are deemed less relevant by our theory.


Evaluating ASR Confidence Scores for Automated Error Detection in User-Assisted Correction Interfaces

arXiv.org Artificial Intelligence

Despite advances in Automatic Speech Recognition (ASR), transcription errors persist and require manual correction. Confidence scores, which indicate the certainty of ASR results, could assist users in identifying and correcting errors. This study evaluates the reliability of confidence scores for error detection through a comprehensive analysis of end-to-end ASR models and a user study with 36 participants. The results show that while confidence scores correlate with transcription accuracy, their error detection performance is limited. Classifiers frequently miss errors or generate many false positives, undermining their practical utility. Confidence-based error detection neither improved correction efficiency nor was perceived as helpful by participants. These findings highlight the limitations of confidence scores and the need for more sophisticated approaches to improve user interaction and explainability of ASR results.


From Guessing to Asking: An Approach to Resolving the Persona Knowledge Gap in LLMs during Multi-Turn Conversations

arXiv.org Artificial Intelligence

In multi-turn dialogues, large language models (LLM) face a critical challenge of ensuring coherence while adapting to user-specific information. This study introduces the persona knowledge gap, the discrepancy between a model's internal understanding and the knowledge required for coherent, personalized conversations. While prior research has recognized these gaps, computational methods for their identification and resolution remain underexplored. We propose Conversation Preference Elicitation and Recommendation (CPER), a novel framework that dynamically detects and resolves persona knowledge gaps using intrinsic uncertainty quantification and feedback-driven refinement. CPER consists of three key modules: a Contextual Understanding Module for preference extraction, a Dynamic Feedback Module for measuring uncertainty and refining persona alignment, and a Persona-Driven Response Generation module for adapting responses based on accumulated user context. We evaluate CPER on two real-world datasets: CCPE-M for preferential movie recommendations and ESConv for mental health support. Using A/B testing, human evaluators preferred CPER's responses 42% more often than baseline models in CCPE-M and 27% more often in ESConv. A qualitative human evaluation confirms that CPER's responses are preferred for maintaining contextual relevance and coherence, particularly in longer (12+ turn) conversations.


Evaluation of state-of-the-art deep learning models in the segmentation of the heart ventricles in parasternal short-axis echocardiograms

arXiv.org Artificial Intelligence

Previous studies on echocardiogram segmentation are focused on the left ventricle in parasternal long-axis views. In this study, deep-learning models were evaluated on the segmentation of the ventricles in parasternal short-axis echocardiograms (PSAX-echo). Segmentation of the ventricles in complementary echocardiogram views will allow the computation of important metrics with the potential to aid in diagnosing cardio-pulmonary diseases and other cardiomyopathies. Evaluating state-of-the-art models with small datasets can reveal if they improve performance on limited data. PSAX-echo were performed on 33 volunteer women. An experienced cardiologist identified end-diastole and end-systole frames from 387 scans, and expert observers manually traced the contours of the cardiac structures. Traced frames were pre-processed and used to create labels to train 2 specific-domain (Unet-Resnet101 and Unet-ResNet50), and 4 general-domain (3 Segment Anything (SAM) variants, and the Detectron2) deep-learning models. The performance of the models was evaluated using the Dice similarity coefficient (DSC), Hausdorff distance (HD), and difference in cross-sectional area (DCSA). The Unet-Resnet101 model provided superior performance in the segmentation of the ventricles with 0.83, 4.93 pixels, and 106 pixel2 on average for DSC, HD, and DCSA respectively. A fine-tuned MedSAM model provided a performance of 0.82, 6.66 pixels, and 1252 pixel2, while the Detectron2 model provided 0.78, 2.12 pixels, and 116 pixel2 for the same metrics respectively. Deep-learning models are suitable for the segmentation of the left and right ventricles in PSAX-echo. This study demonstrated that specific-domain trained models such as Unet-ResNet provide higher accuracy for echo segmentation than general-domain segmentation models when working with small and locally acquired datasets.


Robotic Ultrasound-Guided Femoral Artery Reconstruction of Anatomically-Representative Phantoms

arXiv.org Artificial Intelligence

Femoral artery access is essential for numerous clinical procedures, including diagnostic angiography, therapeutic catheterization, and emergency interventions. Despite its critical role, successful vascular access remains challenging due to anatomical variability, overlying adipose tissue, and the need for precise ultrasound (US) guidance. Errors in needle placement can lead to severe complications, restricting the procedure to highly skilled clinicians in controlled hospital settings. While robotic systems have shown promise in addressing these challenges through autonomous scanning and vessel reconstruction, clinical translation remains limited due to reliance on simplified phantom models that fail to capture human anatomical complexity. In this work, we present a method for autonomous robotic US scanning of bifurcated femoral arteries, and validate it on five vascular phantoms created from real patient computed tomography (CT) data. Additionally, we introduce a video-based deep learning US segmentation network tailored for vascular imaging, enabling improved 3D arterial reconstruction. The proposed network achieves a Dice score of 89.21% and an Intersection over Union of 80.54% on a newly developed vascular dataset. The quality of the reconstructed artery centerline is evaluated against ground truth CT data, demonstrating an average L2 deviation of 0.91+/-0.70 mm, with an average Hausdorff distance of 4.36+/-1.11mm. This study is the first to validate an autonomous robotic system for US scanning of the femoral artery on a diverse set of patient-specific phantoms, introducing a more advanced framework for evaluating robotic performance in vascular imaging and intervention.


The study of short texts in digital politics: Document aggregation for topic modeling

arXiv.org Artificial Intelligence

Statistical topic modeling is widely used in political science to study text. Researchers examine documents of varying lengths, from tweets to speeches. There is ongoing debate on how document length affects the interpretability of topic models. We investigate the effects of aggregating short documents into larger ones based on natural units that partition the corpus. In our study, we analyze one million tweets by U.S. state legislators from April 2016 to September 2020. We find that for documents aggregated at the account level, topics are more associated with individual states than when using individual tweets. This finding is replicated with Wikipedia pages aggregated by birth cities, showing how document definitions can impact topic modeling results.


Matrix Factorization for Inferring Associations and Missing Links

arXiv.org Artificial Intelligence

Missing link prediction is a method for network analysis, with applications in recommender systems, biology, social sciences, cybersecurity, information retrieval, and Artificial Intelligence (AI) reasoning in Knowledge Graphs. Missing link prediction identifies unseen but potentially existing connections in a network by analyzing the observed patterns and relationships. In proliferation detection, this supports efforts to identify and characterize attempts by state and non-state actors to acquire nuclear weapons or associated technology - a notoriously challenging but vital mission for global security. Dimensionality reduction techniques like Non-Negative Matrix Factorization (NMF) and Logistic Matrix Factorization (LMF) are effective but require selection of the matrix rank parameter, that is, of the number of hidden features, k, to avoid over/under-fitting. We introduce novel Weighted (WNMFk), Boolean (BNMFk), and Recommender (RNMFk) matrix factorization methods, along with ensemble variants incorporating logistic factorization, for link prediction. Our methods integrate automatic model determination for rank estimation by evaluating stability and accuracy using a modified bootstrap methodology and uncertainty quantification (UQ), assessing prediction reliability under random perturbations. We incorporate Otsu threshold selection and k-means clustering for Boolean matrix factorization, comparing them to coordinate descent-based Boolean thresholding. Our experiments highlight the impact of rank k selection, evaluate model performance under varying test-set sizes, and demonstrate the benefits of UQ for reliable predictions using abstention. We validate our methods on three synthetic datasets (Boolean and uniformly distributed) and benchmark them against LMF and symmetric LMF (symLMF) on five real-world protein-protein interaction networks, showcasing an improved prediction performance.


Advancing climate model interpretability: Feature attribution for Arctic melt anomalies

arXiv.org Artificial Intelligence

The focus of our work is improving the interpretability of anomalies in climate models and advancing our understanding of Arctic melt dynamics. The Arctic and Antarctic ice sheets are experiencing rapid surface melting and increased freshwater runoff, contributing significantly to global sea level rise. Understanding the mechanisms driving snowmelt in these regions is crucial. ERA5, a widely used reanalysis dataset in polar climate studies, offers extensive climate variables and global data assimilation. However, its snowmelt model employs an energy imbalance approach that may oversimplify the complexity of surface melt. In contrast, the Glacier Energy and Mass Balance (GEMB) model incorporates additional physical processes, such as snow accumulation, firn densification, and meltwater percolation/refreezing, providing a more detailed representation of surface melt dynamics. In this research, we focus on analyzing surface snowmelt dynamics of the Greenland Ice Sheet using feature attribution for anomalous melt events in ERA5 and GEMB models. We present a novel unsupervised attribution method leveraging counterfactual explanation method to analyze detected anomalies in ERA5 and GEMB. Our anomaly detection results are validated using MEaSUREs ground-truth data, and the attributions are evaluated against established feature ranking methods, including XGBoost, Shapley values, and Random Forest. Our attribution framework identifies the physics behind each model and the climate features driving melt anomalies. These findings demonstrate the utility of our attribution method in enhancing the interpretability of anomalies in climate models and advancing our understanding of Arctic melt dynamics.


The Design of On-Body Robots for Older Adults

arXiv.org Artificial Intelligence

Wearable technology has significantly improved the quality of life for older adults, and the emergence of on-body, movable robots presents new opportunities to further enhance well-being. Yet, the interaction design for these robots remains under-explored, particularly from the perspective of older adults. We present findings from a two-phase co-design process involving 13 older adults to uncover design principles for on-body robots for this population. We identify a rich spectrum of potential applications and characterize a design space to inform how on-body robots should be built for older adults. Our findings highlight the importance of considering factors like co-presence, embodiment, and multi-modal communication. Our work offers design insights to facilitate the integration of on-body robots into daily life and underscores the value of involving older adults in the co-design process to promote usability and acceptance of emerging wearable robotic technologies.